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The adaptive immune system relies on the diversity of receptors expressed on the surface of B and 
T-cells to protect the organism from a vast amount of pathogenic threats. The proliferation and 
degradation dynamics of different cell types (B cells, T cells, naive, memory) is governed by a variety 
of antigenic and environmental signals, yet the observed clone sizes follow a universal power law 
distribution. Guided by this reproducibility we propose effective models of somatic evolution where 
cell fate depends on an effective fitness. This fitness is determined by growth factors acting either 
on clones of cells with the same receptor responding to specific antigens, or directly on single cells 
with no regards for clones. We identify fluctuations in the fitness acting specifically on clones as the 
essential ingredient leading to the observed distributions. Combining our models with experiments 
we characterize the scale of fluctuations in antigenic environments and we provide tools to identify 
the relevant growth signals in different tissues and organisms. Our results generalize to any evolving 
population in a fluctuating environment. 


I. INTRODUCTION 

Antigen-specific receptors expressed on the membrane 
of B and T cells (BCRs and TCRs ) recognize pathogens 
and initiate the adaptive immune response [T. An ef¬ 
ficient response relies on the large diversity of receptors 
that protect us against the many pathogens in the envi¬ 
ronment. The adaptive repertoire diversity is maintained 
from a source of newly generated cells, each expressing 
a unique receptor. These progenitor cells later divide 
or die, and their offspring make up clones of cells that 
share a common receptor. The sizes of clones vary, as 
they depend on the particular history of cell divisions 
and deaths in the clone. The clone size distribution thus 
bears signatures of the challenges faced by the adaptive 
system. Understanding the form of the clone size distri¬ 
bution in healthy individuals is an important step in the 
characterization of the antigenic recognition process and 
the functioning of the adaptive immune system. It also 
presents an important starting point for describing sta¬ 
tistical deviations seen in individuals with compromised 
immune responses. 

High throughput sequencing experiments in different 
cell types and species have allowed for the quantification 
of clone sizes and their distributions |31]. Generically, 
clone size distributions exhibit heavy tails |2] [QlfTT]—the 
number of clones of a given size is well described by a 
power law over a wide range of clone sizes (see Fig. Eh 
B). Such universal behavior for a variety of cells types 
in different species (B cells, T cells, naive cells, effec¬ 
tor cells in fish, mice, and humans) suggests common 
underlying rules for growth, death and homeostatic con¬ 
trol of adaptive repertoires. Previous work has described 
the specifics of immune dynamics for a certain cell type 
fl2l 13:, or a certain signaling pathway, using detailed 
mechanistic models [HOI. It remains unclear, how¬ 
ever, what essential features of these dynamics may lead 
to the observed power-law distributions, and what are 
the key biological parameters of the repertoire dynamics 
that govern its behavior. 


The wide range and types of interactions that influence 
a B or T cell fate happen in a complex, dynamical envi¬ 
ronment with inhomogeneous spatial distributions. They 
are difficult to measure in vivo , making their quantitative 
characterization elusive. To overcome this, we describe 
the effective interaction between the immune cells and 
their environment as a stochastic process governed by 
only a few relevant parameters. This effective descrip¬ 
tion is consistent with the idea, common in physics, that 
the apparent universality of the observed distribution is 
underlied by broad model universality classes. Since the 
heavy tail behavior exists in both B and T cells, we con¬ 
sider general properties that are common to both cell 
types, and ignore hypermutations for simplicity. 

All cells proliferate and die depending on the strength 
of antigenic and cytokine signals they receive from the 
environment, which together determine their net growth 
rate. This effective fitness that fluctuates in time is cen¬ 
tral to our description. We find that its general prop¬ 
erties determine the form of the clone size distribution. 
Two broad classes of models are distinguished, according 
to whether these fitness fluctuations are clone specific 
(mediated by their specific BCR or TCR) or cell specific 
(mediated by phenotypic fluctuations such as the num¬ 
ber of cytokine receptors). We identify the models that 
are compatible with the experimentally observed distri¬ 
butions of clone sizes. These distributions do not depend 
on the detailed mechanisms of cell signaling and growth, 
but rather emerge as a result of self-organisation, with 
no need for fine-tuned interactions. 


II. RESULTS 

A. Clone dynamics in a fluctuating antigenic 
landscape 

The fate of the cells of the adaptive immune system 
depends on a variety of clone-specific stimulations. The 
recognition of pathogens triggers large events of fast clone 
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proliferation followed by a relative decay, with some cells 
being stored as memory cells to fend off future infections. 
Naive cells, which have not yet recognized an antigen, 
do not usually undergo such extreme events of prolifera¬ 
tion and death, but their survival relies on short binding 
events (called “tickling”) to antigens that are natural to 
the organism (self-proteins) [17, 18]. Because receptors 
are conserved throughout the whole clone (ignoring B 
cell hypermutations), clones that are better at recogniz¬ 
ing self-antigens and pathogens will on average grow to 
larger populations than bad binders. By analogy to Dar¬ 
winian evolution, they are “fitter” in their local, time- 
varying environment mmm- 

We denote by aj ( t ) the overall concentration of an anti¬ 
gen j as a function of time. We assume that after its 
introduction at a random time tj, this concentration de¬ 
cays exponentially with a characteristic lifetime of anti¬ 
gens A -1 , cij(t ) = ay as pathogens are cleared 
out of the organism, either passively or through the ac¬ 
tion of the immune response. Lymphocyte receptors are 
specific to certain antigens, but this specificity is degener¬ 
ate, a phenomenon refered to as cross-reactivity or poly¬ 
specificity. The extend to which a lymphocyte express¬ 
ing receptor i interacts with antigen j is encoded in the 
cross-reactivity function Kij, which is zero if i and j do 
not interact, or a positive number drawn from a distribu¬ 
tion to be specified, if they do. In general, interactions 
between lymphocytes and antigens effectively promote 
growth and suppress cell death, but for simplicity we can 
assume that the effect is restricted to the division rate. 
In a linear approximation, this influence is proportional 
to V . Kija,j(t ), i.e. the combined effect of all antigens j 
for which clone i is specific. This leads to the following 
dynamics for the evolution of the size C{ of clone i : 


da 

dt 


' + X Ki o a o( t ) - M ] C* + 


( 1 ) 


where v and fi are the basal division and death rates, 
and where B£i(t) is a birth-death noise of intensity 
B 2 = (y + Kij a j{t) + aOQ, with ^(t) a unit Gaus¬ 
sian white noise (see Appendix [A] for details about birth 
death noise). 

New clones, with a small typical initial size Co, are 
constantly produced and released into the periphery with 
rate sc • For example, a number of the order of sc = 10 8 
new T-cells are output by the thymus daily in humans 
[2Ij . Since the total number of T cells is of the order 
of 10 11 , this means that cells die with an average rate 
of 10 -3 days -1 in homeostatic conditions m ■ Because 
the probability of rearranging the exact same receptor 
independently is very low (< 10 -10 ) [22], we assume that 
each new clone is unique and comes with its own set of 
cross-reactivity coefficients Kij. Assuming a rate sa of 
new antigens, the average net growth rate in Eq. [l] is 
/o = ^ + (%,o)(A")<saA -1 — fi < 0, and the stationary 
number of clones should fluctuate around Nq ~ sc'|/o| -1 
clones. This is just an average however, and treating each 


clone independently may lead to large variations in the 
total number of cells {i.e. the sum of sizes of all clones). 
To maintain a constant population size, clones compete 
with each other for specific resources (pathogens or self¬ 
antigens) and homeostatic control can be maintained by 
a global resource such as Interleukin 7 or Interleukin 2. 
Here we do not model this homeostatic control explicitly, 
but instead assume that the division and death rates z/, p 
are tuned to achieve a given repertoire size. We verified 
that adding an explicit homeostatic control did not affect 
our results (see Fig. S2 and Appendix |B| . 

We simulated the dynamics of a population of clones 
interacting with a large population of antigens. Each 
antigen interacts with each present clone with probabil¬ 
ity p = 10 -7 , and with strength drawn from a Gaus¬ 
sian distribution of mean 1 and variance 1 (truncated to 
positive values). A typical trajectory of the antigenic 
stimulation undergone by a given clone, j Kij a j > is 
shown in Fig. IT (green curve), and shows how clone 
growth tracks the variations of the antigenic environ¬ 
ment. When the stimulation is particularly strong, the 
model recapitulates the typical behaviour experimentally 
observed at the population level following a pathogenic 
invasion [231124] . as illustrated in Fig. 0* the population 
of a clone explodes (red curve), driving the growth of the 
total population (blue curve), while taking over a large 
fraction of the carrying capacity of the system, and then 
decays back as the infection is cleared. 

On average, the effects of division and death almost 
balance each other, with a slight bias towards death be¬ 
cause of the turnover imposed by thymic or bone mar¬ 
row output. However, at a given time, a clone that has 
high affinity for several present antigens will undergo a 
transient but rapid growth, while most other clones will 
decay slowly towards exctinction. In other words, lo¬ 
cally in time, the antigenic environment creates a unique 
“fitness” for each clone. Since growth is exponential in 
time, these differential fitnesses can lead to very large 
differences in clone sizes, even if variability in antigen 
concentrations or affinities are nominally small. We thus 
expect to observe large tails in the distrubution of clone 
size. Fig. IK shows the cumulative probability distri¬ 
bution function (CDF) of clone sizes obtained at steady 
state (blue curve) showing a clear power-law behaviour 
for large clones, spanning several decades. 

The exponent of the power-law is independent of the 
introduction size of clones (see inset of Fig. IK) , and the 
specifics of the randomness in the environment (exponen¬ 
tial decay, random number of partners, random interac¬ 
tion strength) as long as its first and second moment are 
kept fixed (See Fig. S3 and Appendix |C|) . 


B. Simplified models and the origin of the power 

law 

To understand the power-law behavior observed in the 
simulations, and its robustness to various parameters and 
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mean, denoted by This leads to rewriting Eq. [l] as: 

dC- 

= + fMC i {t)+B£ i (t), ( 2 ) 

with B 2 w (|/o| + 2/x)Ci- 

The function f/t) encodes the fluctuations of the en¬ 
vironment as experienced by clone i. Because antigens 
can be recognized by several receptors, these fluctuations 
may be correlated between clones. Assuming that these 
correlations are weak, (')) « 0, amounts to treat¬ 
ing each clone independently of each other, and thus 
to reducing the problem to the single clone level. The 
stochastic process giving rise to f/t) is a sum of Poisson- 
distributed exponentially decaying spikes. This process is 
not easily amenable to analytical treatment, but we can 
replace it with a simpler stochastic process with the same 
temporal autocorrelation function. This autocorrelation 
is given by (fi(t)fi(t')) = A 2 e~ x ^ t ~ t I, with the antigenic 
noise strength A 2 = saP^q(K 2 )A~ 1 , and where we recall 
that A -1 is the characteristic lifetime of antigens. The 
simplest process with the same autocorrelation function 
is given by an overdamped spring in a thermal bath, or 
Ornstein-Uhlenbeck process, 

^ = -A fi + V2-jrn(t), (3) 


FIG. 1: Experimental clone size distributions have heavy tails. 
A. B cell zebrafish experimental cumulative clone size distri¬ 
bution for fourteen fish as a function of the fraction of the 
population occupied by that clone from data in Weinstein et 
al. [2]- B. Clone size distribution for murine T-cells from 
Zarnitsyna et al. [lT] (data plotted as presented in original 
paper). C. The dynamics of adaptive immune cells include 
specific interactions with antigens that promote division and 
prevent cell death. New cells are introduced from the thymus 
or bone marrow with novel, unique receptors. Division, death 
and thymic or bone marrow output on average balance each 
other to create a steady state population. D-E. Example 
trajectories from simulations of the immune cell population 
dynamics in Eq. [l] The total number of cells (D) shows large 
variations after an exceptional event of a large pathogenic in¬ 
vasion. One or a few cells that react to that specific antigen 
grow up to a macroscopic portion of the total population, 
and then decrease back to normal sizes after the invasion. A 
typical clone size trajectory along with its pathogenic stimu¬ 
lation JT Kijdj(t) shows the coupling between clone growth 
and variations of the antigenic environment (E). Parameters 
used: sc — 2000 day -1 , Co = 2, sa = 1-96 • 10 7 day -1 , 
= clo = 1, A = 2 day -1 , p — 10 -7 , v — 0.98 day -1 , 
p — 1.18 day -1 . 


sources of stochasticity, we decompose the overall fitness 
of a clone at a given time (its instantaneous growth rate) 
into a constant, clone-independent part equal to its aver¬ 
age /o < 0, and a clone-specific fluctuating part of zero 


with rji(t) a Gaussian white noise of intensity 1 and 
7 = Ay /A quantifies the strength of variability of the 
antigenic environment (see Appendix Jd|. This is also the 
process of maximum entropy or caliber |25| with that 
autocorrelation function (see Appendix | e| and [26]). 

The effect of the birth death noise B£/\t) is negligible 
when compared to the fitness variations for large clones 
and it has no effect on the tail (see Fig. S5 and Appendix 
0 - It can thus be ignored when looking at the tail of the 
distribution and its power law exponent, but it will play 
an important role for defining the range over which the 
power law is satisfied. 

The population dynamics described by Eqs. [2] and [3] 
can be reformulated in terms of a Fokker-Planck equation 
for the joint abundance p of clones of a given log-size 
x = log C and a given fitness /: 


MxJl) , + f) gg + A g(M + a^ + s(x f) 

dt Uo f, dx df + 7 + s v,u, 


dp 


( 4 ) 

where the source term s(x,f) describes new clones ar¬ 
riving at rate sc with size Co and normally distributed 
fitnesses of variance (/ 2 ) = 7 2 /A. This Fokker-Planck 
equation can be solved numerically with finite element 
methods with an absorbing boundary condition at x = 0 
to account for clone extinction. The solution, represented 
by the black curve in Fig. HA, matches closely that of 
the full simulated population dynamics (in blue). The 
power-law behaviour is apparent above a transition point 
that depends on the distribution of introduction sizes of 
new clones and the parameters of the model (see below). 
Intuitively, the microscopic details of the noise are not 
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expected to matter when considering long time scales, 
as a consequence of the central limit theorem. How¬ 
ever, the long tails of the distribution of clone sizes in¬ 
volve rare events and belong to the regime of large devia¬ 
tions, for which these microscopic details may be impor¬ 
tant. Therefore, the agreement between the process de¬ 
scribed by the overdamped spring and the exponentially 
decaying, Poisson distributed antigens is not guaranteed, 
and in fact does not hold in all parameter regimes (see 
Fig.fSSj). 

We can further simplify the properties of the noise 
by assuming that its autocorrelation time is small com¬ 
pared to other timescales. This leads to taking the limit 
7 , A —» oo while keeping their ratio constant a = 7 /A con¬ 
stant, so that fi(t) is just a Gaussian white noise with 
= 2a 2 S(t — t') (see Appendix | f| and Fig. S4). 
The corresponding Fokker-Planck equation now reads 


d t p(x, t ) = -f 0 d x p(x, t ) + a d 2 p(x , t) + s(x), (5) 


with s(x) = scS(x — log(Co)). This equation can be 
solved analytically at steady state, and the resulting 
clone size distribution is, for C > Cq: 


P(C) 


s c 1 

CUT 2 C a+1 ’ 


(6) 


with a = \fo\/cr 2 = A|/o|/A 2 (details in Appendix |f| . 
The full solution, represented in Fig. [2jA in red, captures 
well the long-tail behaviour of the clone size distribution 
despite ignoring the temporal correlations of the noise, 
and approaches the solution of the colored-noise model 
(Eq.§ as A, 7 —>• 00 , as expected (see Fig. 

The power law behaviour and its exponent depend on 
the noise intensity, but are otherwise insensitive to the 
precise details of the microscopic noise, including its tem¬ 
poral properties. Fat tails (small a) are expected when 
the average cell lifetime is long (small |/o|) and when the 
antigenic noise is high (large a or A). The explicit ex¬ 
pression for the exponent of the power law 1 + a as a 
function of the biological parameters can be used to in¬ 
fer the antigenic noise strength A 2 directly from data. 
An inverse lifetime of |/o| ~ 10 -3 days -1 of a typical 
T-cell can be estimated as the ratio of thymic output to 
the total population of lymphocytes in the body [27H29] . 
The characteristic lifetime of antigens A -1 is harder to 
estimate, as it corresponds to the turnover time of the 
antigens that the body is exposed to, but is probably of 
the order of days or a few weeks, A ~ 0.1 day -1 . We 
estimated a = 1 ± 0.2 from the zebrafish data of Fig. ek 
muni using canonical methods of power-law exponent 
extraction m (see Appendix [G] for details), and also 
found a similar value in human T cells |3T[. The result¬ 
ing estimate, A = 10 -2 day -1 , is rather striking, as it 
implies that fluctuations in the net clone growth rate, A , 
are much larger than its average /q. 

While the distribution always exhibits a power law 
for large clones, this behavior does not extend to clones 
of arbitrarily small sizes, where the details of the noise 





FIG. 2 : Clone size distributions for populations with fluctu¬ 
ating antigenic, clone-specific fitness. A. Comparison of sim¬ 
ulations and simplified models of clone dynamics. Blue curve: 
cumulative distribution of clone sizes obtained from the simu¬ 
lation of Eq.[l] Black curve: a simplified, numerically solvable 
model of random clone-specific growth, also predicts a power- 
law behaviour. Red curve: analytical solution fo the Gaussian 
white noise model, Eq. [ 4 ] Parameters used: p — 0.98 day -1 , 
v — 1.18 day -1 , A = 2 day -1 , sc — 2000 day -1 , Co = 2, 
sa = 1-96 • 10 7 day -1 . Inset: the exponent is independent of 
the initial clone size. Results from simulation with different 
values of the introduction clone size. The cut-off value of the 
power law behaviour, represented here as a dot, is strongly 
dependent on the value of Co- Parameters are p = 0.2 day -1 , 
v — 0.4 day -1 , A = 2 day -1 , 7 = 1 day -3 / 2 and sc — 5000. 

B. Value of the cumulative distribution function at the point 
of the power law cut-off as a function of the introduction clone 
size Co for different values of a dimensionless parameter re¬ 
lated to the effective strength of antigen fluctuations relative 
to their characteristic lifetime A 3 / 7 2 for a fixed power law 
exponent a. We use the cumulative distribution function be¬ 
cause it is robust, invariant under multiplicative rescaling of 
the clone sizes. This way we do not need to correct directly 
for PCR multiplication or sampling. Parameters are for B 
and C p = 4.491 days -1 , v — 5.489 days -1 and a = —0.998. 

C. Power-law cut-off as a function of the introduction clone 
size. 


and how new clones are introduced matter. We define a 
power-law cut-off (7* as the smallest clone size for which 
the cumulative distribution function (CDF) differs from 
its best power-law fit by less than 10%. Using numeri¬ 
cal solutions to the Fokker-Planck equation associated to 
the colored-noise model, we can draw a map of C* as a 
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function of the parameters of the system. In Fig. [2J3-C 
we show how (7* varies as a function of the introduction 
size for different values of the dimensionless parameter 
related to the effective strength of antigen fluctuations 
relative to their characteristic lifetime at fixed power law 
exponents. In principle one can use this dependency to 
infer effective parameters from data. In practice, when 
dealing with data it is more convenient to consider the 
value of the cumulative distribution at (7*, rather than 
(7* itself. For example, fixing Co = 4 and fitting the 
curve of Fig. EK with our simplified model using A as an 
adjustable parameter, we obtain A ~ 0.14 day -1 (see Ap¬ 
pendix [G|, which corresponds to a characteristic lifetime 
of antigens of around a week. Although this estimate 
must be taken with care, because of possible PCR am¬ 
plification biases plaguing the small clone size end of the 
distribution, the procedure described here can be applied 
generally to any future repertoire sequencing dataset for 
which reliable sequence counts are available. 

C. A model of fluctuating phenotypic fitness 

So far, we have assumed that fitness fluctuations are 
identical for all members of a same clone. However, the 
division and death of lymphocytes do not only depend on 
signaling through their TCR or BCR. For example, cy¬ 
tokines are also growth inducers and homeostatic agents 
[321 33 , and the ability to bind to cytokines depends on 
single-cell properties such as the number of cytokine re¬ 
ceptors on the membrane of a given cell, independent of 
their BCR or TCR receptor. Other stochastic single-cell 
factors may affect cell division and death. These sig¬ 
nals and factors are cell specific, as opposed to the clone 
specific properties related to BCR or TCR binding. To¬ 
gether, they define a global phenotypic state of the cell 
that determines its time-varying “fitness,” independent 
of the clone and its T-cell or B-cell receptor. This does 
not mean that these phenotypic fitness fluctuations are 
independent across the cells belonging to the same clone. 
Cells within a clone share a common ancestry, and may 
have inherited some phenotypic properties of their com¬ 
mon ancestors, making their fitnesses effectively corre¬ 
lated with each other. However, this phenotypic mem¬ 
ory gets lost over time, unlike fitness effects mediated by 
antigen-specific receptors. 

We account for these phenotypic fitness fluctuations 
by a function f c {t) quantifying how much the fitness of 
an individual cell c differs from the average fitness /o- 
This fitness difference is assumed to be partially herita¬ 
ble, which we model by: 

^ =-A f c (t)+ V2j c rj c (t), (7) 

where A -1 is the heritability, or the typical time over 
which the fitness-determining trait is inherited, y c quan¬ 
tifies the variability of the fitness trait, and rj c (t) is a 
cell-specific Gaussian white noise of power 1. Despite its 


formal equivalence with Eq. [3] it is important to note 
that here the fitness dynamics occurs at the level of the 
single cell (and its offspring) instead of the entire clone. 
The dynamics of the fitness fi(t ) of a given clone i can 
be approximated from Eq. [7] by averaging the fitnesses 
f c (t ) of cells in that clone, yielding: 

dC- 

T = ifo+mmt) + (8) 

t = ~ A/<< * )+ T' /W) ’ (9) 

where rji(t) and &(t) are clone-specific white noise of in¬ 
tensity 1, and v and p are the average birth and death 
rates, respectively, so that fo = v—fi (details in Appendix 
lib . The difference with Eq.pl is the 1 / y/Ci(t) prefactor in 
the fitness noise rji(t), whicn stems from the averaging of 
that noise over all cells in the clone, by virtue of the law 
of large numbers. Because of this prefactor, the fitness 
noise is now of the same order of magnitude as the birth- 
death noise, which must now be fully taken into account. 
Taking Eq. [8] and Eq. [9] at the population level gives 
a Fokker-Planck equation with a source term account¬ 
ing for the import of new clones. We verify the numeri¬ 
cal steady state Fokker-Planck solution against Gillespie 
simulations (Fig[S6j see Appendix [Hi for details). 

Fig. [3]A.-B show the distribution of clone sizes for dif¬ 
ferent values of the phenotypic relaxation rate A and en¬ 
vironment amplitude y c . These distributions vary from 
a sharp exponential drop in the case of low heritabil- 
ity (large A) to heavier tails in the case of long con¬ 
served cell states (small A). To quantify the extend 
to which these distributions can be described as heavy¬ 
tailed, we fit them to a power law with exponential cut¬ 
off, p(C) oc C~ 1 ~ a e~ c / Crn , where C m is the value below 
which the distribution could be interpreted as an (imper¬ 
fect) power law. Fig. I T shows a strong dependency of 
this cut-off with the phenotypic memory A -1 . The longer 
the phenotypic memory A -1 , the more clone-specific the 
fitness looks like, and the more the distribution can be 
mistaken for a power law in a finite-size experimental 
distribution. Larger birth-death noise also extends the 
range of validity of the power-law. As a result, and de¬ 
spite the absence of a true power-law behaviour, these 
models of fluctuating phenotypic fitnesses cannot be dis¬ 
carded based on current experimental data. 

The model can be solved exactly at the two extremes 
of the herit ability parameter A. In the limit of infinite 
heritability (A 0) the system is governed by selec¬ 
tive sweeps. The clone with the largest fitness com¬ 
pletely dominates the population, until it is replaced 
by a better one, giving rise to a trivial clone-size dis¬ 
tribution. In the opposite limit, when heritability goes 
to 0 (A Toe), the Fokker-Planck equation can be 
solved analytically (see Appendices [T and [jJ, yielding 
an exact power-law with exponential cutoff, p{C) oc 
C _1_a e _c / Cm , with a = — [1 T (/i T z/)A 2 /2y^] 1 and 
C m = {p — iy )~ 1 [{p + dv)/2 T 7c/A 2 ]. The numerical so¬ 
lution of Fig. [3^3 is close to this limit. Note that even 
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FIG. 3: Clone size distributions for populations with a cell 
specific fluctuating phenotypic fitness. A. Cumulative dis¬ 
tribution of clone sizes for moderate phenotypic heritability 
(A -1 ). The distribution is power-law like for small clone val¬ 
ues and drops above a cut-off around 0.01 of clone size prob¬ 
ability. An experiment that does not sequence the repertoire 
deeply enough could report a power law behavior (see zoom). 
Parameters are /x — 0.17 days -1 , v — 0.3 day -1 , A — 0.4 
days -1 and y c = 0.5 days -3 / 2 . Co = 2 for all three graphs. 
B. An example of a distribution of clone sizes from a cell- 
specific model with very low environmental noise, close to the 
pure birth-death limit. The distribution is flat (a = 0) and 
then drops exponentially. It does not resemble experimen¬ 
tal data. Parameters are /x — 0.1 days -1 , v — 0.3 days -1 , 
A = 2 days -1 and y c = 5 days -3 / 2 . C. Value of the cu¬ 
mulative distribution at the exponential cut-off as a function 
of the speed of environment variations A, for different birth- 
death noise levels. Parameters are /o = —0.998 days -1 and 
/ 0 A 2 / 7c 2 = 0.998. 


with a negligible exponential cutoff, the predicted a < 0 
contradicts experimental observations. 


III. DISCUSSION 

Clone size distributions from high throughput data 
have been available for a few years and hold great promise 
for understanding the processes that shape the composi¬ 
tion of the repertoire. Yet their interpretation requires 
models. The fate of lymphocytes making up immune 
repertoires is governed by a wide variety of mechanisms 
and pathways that vary across cell types and species. 
Previous population dynamics approaches to repertoire 
evolution have taken great care in precisely modeling 
these processes for each compartment of the population, 
through the various mechanisms by which cells grow, 
die and change phenotype [I2j ECU EH- However, one 


of the most striking properties of repertoire statistics re¬ 
vealed by high-throughput sequencing—the observation 
of power laws in clone size distribution—hold true for var¬ 
ious species (human, mice, zebrafish), cell type (B and 
T cells) and subsets (naive and memory, CD4 and CD8), 
and seems to be insensitive to these context-dependent 
details. These observations call for a stochastic descrip¬ 
tion of the various properties affecting cell fate, and their 
ubiquity points to some universal features of the dynam¬ 
ics. 

The model introduced in this paper describes the 
stochastic nature of the immune dynamics with a min¬ 
imal number of parameters, easing the numerical and 
analytical treatment and helping interpret the different 
regimes. These parameters are effective in the sense that 
they integrate different levels of signaling, pathways, and 
mechanisms. We assumed that they are general enough 
that different cell types (B and T cells) or subsets can 
be described by the same dynamical equations. Most of 
these parameters have natural interpretations: the typi¬ 
cal lifespan of an infection or antigen stimulation is given 
by A -1 , and the effect of these stimulations by A. For 
memory cells, A -1 corresponds to the duration of an in¬ 
fection. For naive cells, the interpretation is less clear, 
as non-pathogenic antigens such as self-antigens may not 
be as transient as pathogens; in this case A -1 could cor¬ 
respond to the characteristic timescale of changes in a 
clone’s micro-environment. Likewise, in the context of 
cell-specific noise, the parameters A -1 and y c correspond 
to the correlation time and variability of the cell pheno¬ 
type. All these parameters may vary greatly across cell 
types and species. 

We showed that the relevant sources of stochasticity 
for the shape of the clone-size distributions fall into two 
main categories, depending on how cell fate is affected by 
the environment. Either the stochastic elements of clone 
growth act in a clone-specific way, through their receptor 
(BCR or TCR), leading to power-law distributions with 
exponent > 1, or in a cell-specific way, e.g. through their 
variable level of sensitivity to cytokines (and more gener¬ 
ally through any phenotypic trait affecting cell fitness), 
leading to exponentially decaying distributions with a 
power-law prefactor. These two types of signals (clone 
specific and cell specific) are important for the somatic 
evolution of the immune system [201 El E1I35H33 and 
our analysis shows that the shape of the clone size dis¬ 
tribution is informative of their relative importance to 
the repertoire dynamics. It provides a first theoretical 
setting and an initial systematic classification for model¬ 
ing immune repertoire dynamics. Our method applied to 
high-throughput sequencing data can be used to quantify 
how much each type of signal contributes to the over¬ 
all dynamics, and what is the driving force for the dif¬ 
ferent cell subsets. For example, although it is reason¬ 
able to speculate that clone-specific signals should domi¬ 
nate for memory cells (through antigen recognition), and 
cell-specific selection for naive cells (through cytokine- 
mediated homeostatic division), the relative importance 
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of these signals for both cell types is yet to be precisely 
quantified, and may vary across species. A clear power 
law over several decades would strongly hint at dynamics 
dominated by interactions with antigens, while a faster 
decaying distribution would favor a scenario where indi¬ 
vidual cell fitness fluctuations dominate. Applying these 
methods to data from memory cells can give orders of 
magnitude for the division and half-life of memory lym¬ 
phocytes, as well as the typical number of cells Co from 
a clone that are stored as memory following an infection. 

The application of our method to data from the first 
immune repertoire survey (B cell receptors in zebrafish 
[2]) suggests that clone-specific noise dominates in that 
case, allowing us to infer a relation between the dynami¬ 
cal parameters of the model from the observed power- 
law exponent ~ 2. However, there are a few issues 
with applying our method directly to data in the cur¬ 
rent state of the experiments. First, the counts (i.e. how 
many cells have the same receptor sequence and belong 
to the same clone) from many high-throughput reper¬ 
toire sequencing experiments are imperfect because of 
PCR bias and sampling problems. New methods using 
single-molecule barcoding have been developed for RNA 
sequencing nnnim, but they do not solve the problem 
entirely, as the number of expressed mRNA molecules 
may not faithfully represent the cell numbers because 
of possible expression bias. In addition, most studies 
(with the exception of |40]) have been sequencing only 
one of the two chains of lymphocyte receptors, which is 
insufficient to determine clone identity unambiguously. 
As methods improve, however, our model can be applied 
to future data to distinguish different sources of fitness 
stochasticitiy and to put reliable constraints on biological 
parameters. 

Thanks to its generality, our model is also relevant 
beyond its immunological context, and follows previous 
attempts to explain power laws in other fields mm- 
The dynamics described here corresponds to a general¬ 
ization of the neutral model of population genetics [44] 
where thymic or bone marrow outputs are now rein- 
tepreted as new mutations or speciations, and where we 
have added a genotypic or phenotypic fitness noise (re¬ 
ceptor or cell-specific noise, respectively). It was recently 
shown that such genotypic fitness noise strongly affects 
the fixation probability and time in a population of two 
alleles maun. Our main result (Eq. |6| shows how fit¬ 
ness noise can cause the clone-size distribution (called 
frequency spectrum in the context of population genet¬ 
ics) to follow a power law with an arbitrary exponent > 1 
in a population of fixed size, while the classical neutral 
model gives a power law of exponent 1 with an exponen¬ 
tial cut off (as shown in our exact solution with y c = 0). 
Our results can be used to explain complex allele fre¬ 
quency spectra using fluctuating fitness landscapes. 



FIG. SI: We compare results from a full Gillespie simulation 
(blue crosses) of a system with only birth-death dynamics 
wit h an alytical prediction for a discrete system (bl ack cr osses, 
Eq. |A3| ) and a continuous system (red curve, Eq. |A12 ). The 
prediction with discrete variables is more accurate for small 
clones but the behaviour of all systems is the same for large 
populations. The parameters are y — 1.45 day -1 , v t= 1.5 
day -1 , Co = 2 and we introduce 2000 new clones per day. 


Appendix A: Simple birth-death process with no 
fitness fluctuations, and its continuous limit 

In this Appendix we derive the steady-state clone size 
distribution for a system that does not experience any 
environmental stimulation or noise, but is governed by a 
birth death process. We will show that the small number 
fluctuations arising from the discrete nature of birth and 
death are not sufficient to explain the observed distribu¬ 
tions. We also show that our choice of a continuous birth 
death process is equivalent to its discrete version. 

The multiplicative birth-death process corresponds to 
the following discrete dynamics: 

{ P(n -> n + 1) = jindt ( . 

P(n n — 1) = vndt, ^ 

where /i is the division rate, v the death rate. We assume 
that the population of cells of size n is maintained out 
of equilibrium by a source of new cells. The steady state 
solution for cell numbers above the value of the source 
satisfies detailed balance 

P(n)/m = P(n + l)v(n + 1) (A2) 

and, assuming the death rate is larger than the birth rate, 
takes the form 
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P(n) ~ 


K 

_ e ~n logi///x 
n 


(A3) 







The continuous counterpart of this discrete stochastic 
process corresponds to the following linear-noise approx¬ 
imation: 


d t Ci = foCi + y/(n + v)C& (A4) 


where = S(t — t') and fo = /j J — u<0 (and 

we use the Ito convention ). In terms of x = log C the 
Langevin equation is 

d t x = f 0 + VJT+^e- x ' 2 i - e ~ x , (A5) 

and the corresponding Fokker-Planck equation reads 


dtP = d x {-fop)+dl(^^e x pj+d x (e X p^^j+s(x), 

(A6) 

where s(x) is the distribution of sizes of newly arriving 
clones. At steady state, we find 

K - s c 0(x - :ro) = -f 0 p + -y - e~ x p', (A7) 

where K is an integration constant. Defining 

C m = (/^ + z/ )/(2|/o|) (A8) 


for x < xq we obtain 


p[x) = r e x e eX /°- = KC m ( 

Jo 

(A9) 

and for x > Xq 

p(x) = e~ eX ! Cm C m Ke eX /Cm - Ke 1/Cm (A10) 

__^_ p e x /Cm , __^C_ e x ° /C m ‘ 
l/ol Cm l/ol Cm \ 

To ensure convergence we set K = sc/(\fo\C m ) and the 
steady solution of the Fokker-Planck equation is 


Appendix B: Effects of explicit global homeostasis 


In the simulations of clone dynamics in a fluctuating 
environment presented in the “Clone dynamics in a fluc¬ 
tuating antigenic landscape” Results section of the main 
text, we did not explicitly include a homeostatic control 
term, but tuned the division and death rates to achieve 
a given repertoire size. Here we add an explicit homeo¬ 
static term to the growth and degradation terms in the 
Langevin simulations described by Eq. 1 of the main text 


— h 


rE i qi r 

N 


(Bl) 


where TV is a carrying capacity, h is the homeostatic con¬ 
stant multiplicator and r is the exponent of homeostatic 
response that described the sharpness of the response 
when approaching then carrying capacity limit. Com¬ 
paring in Fig. [S2] the resulting clone size distribution 
obtained with the explicit homeostatic term to the dis¬ 
tribution from the simulations in the main text, we see 
that the explicit homeostatic term does not have an effect 
on the form of the distribution. It does have an effect on 
the trajectory of certain clones, and in particular on the 
response of the system to a very large invasion, making 
it an important feature of the dynamics of the immune 
system. However, as shown by the results in Fig. S2 its 
net effect on the clone size distribution can be taken into 
account by tuning division and death. When consider¬ 
ing specific trajectories in the mean field approximation 
homeostatic control will add a systematic negative drift 
to the clonal population and can be accounted for by an 
additional contribution to /q. 


Appendix C: Details of noise partition do not 
influence the clone size distribution function 


, ^ ff&(l- e “ (e *" 1)/Cm ) >if*<*o 

pyx) — < e x ° /Cm C~ 1 \ -e x /Cm ‘f \ 

l boF e — e rn > e ’ ^ x > x o 

(All) 

or in terms of the clone size 


p(C) = 


i(l _ e -(c-i)/c m ) ) if c < Co 
{e Co/C m _ e C^ ) e_Z^l /liC>Co 


(A12) 


This result is exactly equivalent to that of Eq. |A3| when 
v—fi = |/o| /i, v. The accuracy of the approximation is 

verified in Fig. [Sl| Even for very large exponential cutoff 
values, C m , the apparent exponent is a = 0, correspond¬ 
ing to a flat cumulative distribution. This distribution is 
inconsistent with experiments, regardless of sequencing 
depth and we conclude that pure birth-death noise is not 
sufficient to explain the observed distributions. 


In the simulation of the dynamics of receptors expe¬ 
riencing a clone-specific fitness presented in the “Clone 
dynamics in a fluctuating antigenic landscape” Results 
section of the main text we distributed the noise be¬ 
tween the different random distributions: the poisson 
distributed number of new antigens (sa), the variance of 
the initial concentrations (ayo) and the variance of the 
binding probability (the values of Kij ). We made specific 
choices for this reparation by picking specific parameters 
of the random processes. Here we show that these specific 
choices of repartitioning the contributions to the noise do 
not influence the clone size distributions. Fig. [S3] com¬ 
pares clone size distributions obtained with different val¬ 
ues of the poisson distributed number of newly arriving 
antigen N a and the variance of the Gaussian distributed 
binding probabilities , reproducing the same distribu¬ 
tions in both cases. 
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FIG. S2: Adding an explicit homeostatic control term does 
not affect the clone size distribution compared to tuning the 
degradation and death rates to obtain a given repertoire size 
as is done in the main text. Comparison of the clone size 
distribution with an explicit homeostatic control term given 
(black line) to the distribution presented in the 


B1 


by Eq. _ 

main text (red line). We simulate the Langevin equation for 
a division rate p — 0.2 days -1 , death rate v — 0.4 days -1 , 
introduction size Co = 2, environmental correlation time of 
A -1 as 0.5 days and an amplitude of variations of the environ¬ 
ment A = 1.41 days -1 without any homeostatic control for 
the red curve and with carrying capacity TV = 4 • 10 10 {h — 1) 
and a homeostatic exponent r — 3 for the black curve. 


Appendix D: Model of temporally correlated 
clone-specific fitness fluctuations 

In the “Simplified models and the origin of the power 
law” Results section of the main text we make a series 
of approximations to effectively describe the dynamics of 
immune cells: we first approximate the antigenic environ¬ 
ment by a random process with time correlated (colored) 
noise and we later neglect these temporal correlations. 
In this section and Appendix [F] we give the details that 
lead to the specific forms of the effective equations. In 
this Appendix we derive the Fokker-Planck equations for 
the time correlated noise model. In Appendix |F| we will 
consider the limit of an infinitely quickly changing envi¬ 
ronment. 

The Langevin equations describing the dynamics of 
cells experiencing clone specific fitness fluctuations with 
a finite correlation time are 
An. 

= t/o+ mwm + pi) 

^ = -xMt) + V2m(t), (D2) 


FIG. S3: Repartitioning the sources of stochasticity between 
the number of new antigens per time unit or the variability of 
binding probabilities does not influence the clone size distri¬ 
butions. We compare simulations of the full system dynamics 
defined by Eq. 1 of the main text with two sets of values 
sa of the poisson distributed number of newly arriving anti¬ 
gen N a and the variance of the Gaussian distributed binding 
probabilities Kij that give the same total environmental noise 
A 2 = SApa‘o(K 2 )X~ 1 . The parameters were taken to be (as 
in Fig. 1) sc = 2000 day -1 , Co = 2, day -1 , ayo = a 0 — 1, 
A = 2 day -1 , p = 10 -7 , v — 0.98 day -1 , p — 1.18 day -1 . 
For the red curve the variance of the entries of Kij is 1, so 
that (K 2 ) = 2 and sa = 1.96 • 10 7 while for the black curve 
the variance of the entries of Kij is 3, so that (K 2 ) = 4, and 
SA = 0.98 • 10 7 . 


genic environment. The autocorrelation function of this 
Ornstein-Uhlenbeck process is 

(. mfi(t ')> = e- A ( t+t ') (</i(0) 2 > - x) + 

(D3) 

We pick the steady-state value of t he i nitial fitness dis¬ 
tribution to cancel the first in Eq. |D3[ (/i(0) 2 ) = y 2 /A 
and obtain 

(D4) 

(conditioned on the integral of the net growth rate / + /o 
being positive so that the clone does not go extinct). Set¬ 
ting x = log C, we obtain a new set of Langevin equations 

d t Xi = f 0 + fi + Vm + veT Xil2 ii ~ e~ Xi — ^ — , (D5) 
^ = —A fi + (D6) 


where = 5{t — t r ) represents birth death noise 

in the linear-noise approximation (with the Ito conven¬ 
tion) and = S(t — t') is the noise of anti- 


where the birth-death noise is now treated in the Ito 
convention. The corresponding Fokker-Planck equation 
for the distribution of fitness and clone size at time £, 
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p(x, /, £), verifies 

dtp = d x (—fop) + df(Xfp) + 9/(7 2 p) + 


(D7) 


p + v _ T . 0 

—-— e p ) + d x 


_ p-\-v 

e X P 


+s(^/), 


where s(x, f ) is the source of new clones. We solve this 
equation numerically using finite element methods to ob¬ 
tain clone size distributions for the clone-specific fitness 
model. 


Appendix E: The Ornstein Uhlenbeck process and 
maximum entropy 


In this Appendix we show that the maximum en¬ 
tropy or maximum caliber process with autocorrelation 
function (x(t)x(t + s)) = A 2 e~ x ^ corresponds to the 
Ornstein-Uhlenbeck process. We consider this continu¬ 
ous maximum entropy process as the continuous limit 
of a simpler maximum entropy system in discrete time. 
Burg’s maximum entropy theorem m states that the 
maximum entropy process in discrete time that con¬ 
strains (X n (t) 2 ) = A 2 and (X n (t)X n +i(t)) = A 2 e~ Xr 
corresponds to the following Markovian dynamics: 

X n+1 = e~ XT X n + v/l - e-^An, (El) 


where p is Gaussian white noise. In the limit of r 0 
we recover the constrained autocorrelation function in 
the vicinity of s = 0 + : ( x(t) 2 ) = A 2 , (d/ds)(x(t)x(t + 
S ))ls=0+ = —A A 2 , and Eq. |El| converges to an Ornstein- 
Uhlenbeck process. 


Appendix F: Model solution for white-noise 
clone-specific fitness fluctuations 


In the limit of infinitely quickly fluctuating environ¬ 
ments, 7 +oo and A Too while keeping their ra¬ 
tio a — 7 /A constant, the autocorrelation of the fitness 
noise approaches a Dirac delta function, and the fluctu¬ 
ating part of the growth rate fi(t) converges to Gaussian 
white noise, = 2<j 2 S(t — t'). Effectively the 

immune cell dynamics are now described by a one dimen¬ 
sional Langevin equation for the clone size 

dtCi = f 0 Ci + V2aCir]i + 7(i / + n)Ci(t)£i, (FI) 


where {pi(t)pi(t')) = S(t — t') follows the Stratanovich 
convention and ^ is as before. The equation for the log¬ 
arithm of the clone size x = log C is 

d t Xi = f 0 +V2ar)i+^jI + ve~ Xi/2 t >i -e~ Xi — ^ (F 2 ) 


We explicitly checked that the numerical solution to 

con¬ 


the clone specific fitness model in Eqs. D1 and D2 


verged to the dynamics described by Eq. FI, as demon¬ 
strated in Fig. [S~T| 



FIG. S4: Comparison between clone size distribution obtained 
as solutions of the time-correlated and time-uncorrelated 
noise models (without birth death noise). As the values of 
the dimensionless parameter related to the effective strength 
of antigen fluctuations relative to their characteristic lifetime 
A 3 /7 2 grow the time correlated noise prediction converges to 
the exact power-law solution of the white-noise model. The 
cut-off value of the power law decreases with A 3 /y 2 . All sim¬ 
ulations performed at a constant value of a = |/o|A 2 / 7 2 set 
to 0.5. The value of /o is kept fixed to —0.5 days -1 for all 
solutions. 


We now solve this equation analytically, starting with 
the case of no birth-death noise: Eq. |F1| simplifies to 

dtCi = foCi + V^aCiVi (F3) 

The equation for x = log C (using the Stratanovich con¬ 
vention) is 

d t Xi = fo + V2arn, (F4) 

with the corresponding Fokker Planck equation 

d t p(x,t) = d x (-f 0 p) + I<9z[2<t 2 d x p\ + s(x), (F5) 

where s(x) is the source term describing the size of newly 
introduced clones. Assuming a constant initial clone size, 
s(x) = scS(x — £ 0 ), the steady state solution is 

p(x) = e~ ax - \Ke ax -K- s c a 2 e ax + s c a 2 e x ° 1 , 
a L J 

(F 6 ) 

where we have defined 


« = l/ol/cr 2 , (F7) 

and K is an integration constant. Imposing that p van¬ 
ishes at infinity sets K = sccr 2 and the final form of the 
steady state clone size distribution is 


p{x) 


ffy(l-e~ ax ) ifx<x 0 

S£_ e -ax ( e x 0 _ 2 ) if x > Xq, 


(F 8 ) 
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Assuming that the initial size is constant, the steady state 
solution is given by the solution of the inhomogeneous 
linear equation: 

K - s c e{x - so) = -fop + v 2 p' + e~ x ^^p'. (F12) 

The full solution is the sum p = p 0 + pi of the particular 
solution, 


Po(x) 


(F13) 


and the solution pi to the homogeneous equation 

fopi = 0 V 1 + (F14) 


of solution: 


FIG. S5: We compare simulations of the Langevin dynamics 
with time correlated antigenic noise with birth-death noise 
(black line) to the same dynamics without the birth-death 
noise (red line). All other parameters are kept fixed.We find 
similar values of the power law exponents but different small 
clone behaviours. The parameters are p = 0.2 day -1 , v — 0.4 
day -1 (for red curve simply /o — —0.2 day -1 ) , Co = 2, A = 2 
day -1 and 7 = 1 day -3 / 2 


Pi(x) = K' 


r x I (A+G 1 a 
e + 2a 2 

1 1 (m+G 
1 + 2 cr 2 J 


with a = |/o|/cr 2 . Therefore, for x > xq 


p(x) = K' 


r x I (A+G 1 a 

e + 2 a 2 

1 I (m+G 
, 1 + 2a 2 . 


K-s 

~W 


(F15) 


(F16) 


or in terms of clone size C = e®, 


we set K = s for convergence and obtain the steady state 
clone size distribution for large x 


Pip) 


(1-^) if C < Co 


\fo\C <j<*j “ - ' - - /pgN 

l^f c^+1 (c^q — -*-) hC>Co- 


In all simulations and solutions we find that for large 
clones, the model of temporally correlated fitness fluctua¬ 
tions behaves as the its white noise limit. This behaviour 
can be explained by the fact that large clones need a long 
time to become large. At these long timescales, the char¬ 
acteristic time of noise correlation is negligible and the 
noise may be approximated as white. For this reason, 
the exponent a of the power law computed assuming a 
white noise for the fitness fluctuations is still valid even 
when that noise is actually correlated in time. 

Next, we re-introduce the birth-death noise and solve 
the general equation. The Langevin equation for x = 
l°g C, 


p{x) = 


e x + 


P + V 


2 cr 2 


or in terms of the clone size 


p{C) = 


c{c+^y 


(F17) 


(F18) 


We see that the white noise solution with birth-death 
noise has the same large clone power law behaviour as 
without birth-death noise. Fig. [S5] illustrates how birth 
death noise in the clone-specific fitness models with time 
correlated noise also does not affect the power law expo¬ 
nent but only the cut off of the power law. 


Appendix G: Data analysis 


d t x = f 0 + V2<rr, + - e -^ ( fio ) 

results in the Fokker-Planck equation for the distribution 
of clone sizes 

dtp = d x (-f 0 p) + ^d x [2a 2 d x p] + d 2 x {^~Y~ e ~ x pj 

+d x (^ °~ x p + s ( x )- 

(Fll) 


In the main text we report values of the power law 
exponents and power law cut off values obtained from 
the high throughput sequencing repertoire study of clone 
size distributions of zebrafish B-cell heavy chain recep¬ 
tors of Weinstein et al. [2]. We extracted the power law 
exponent and the best fit for the starting point of the 
power law, defined as its lower bound cutoff, from the 
discrete clone size distributions plotted in Fig. 1 of the 
main text using the methods discussed by Clauset and 
Newman [30] . Specifically, for each point of the cumu¬ 
lative clone size distribution we compute an estimate of 
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the power law exponent with that point as cutoff (i.e the 
best fit of the power law including only the values of the 
distribution above that point) using 


^(^min) — 1+77/ 


+ lQ g 

_i= 1 



(Gl) 


where C m i n is the cut off and n is the number of points 
with y-axis values above C m [ n . For each of these cut-off 
values we compute the Kolmogorov-Smirnov distance be¬ 
tween the data and the estimated power law distribution: 


d(C m[n ) = rnax c>Cmin |F d (C) - F e (C ; C min )| (G 2 ) 


where the maximum is taken over all values above the 
cut off Cmin, Fd is the cumulative distribution function 
(CDF) of the data and F e (C;C m [ n ) is the CDF of the 
estimated power law distribution with C m [ n as a cutoff, 
using Eq. |G1| The the cut off is taken to be the minimum 
of this distance over all possible cut off values and the 
exponent is the exponent found for this value. 

The obtained power law parameters are presented in 
Table [l| The power law exponent gives reproducible val¬ 
ues for different individuals and agrees with values of the 
same exponent obtained from human data [31 . We note 
that the power law exponent of the cumulative distri¬ 
bution function is a for a power law distribution with 
exponent 1 + a. As discussed in detail in the main text, 
the reliability of the cutoff estimate C* is sensitive to ex¬ 
perimental precision of capturing the rare clones. In the 
presented dataset the reads were not barcoded and the 
counts had to be renormalized by a known PCR amplifi¬ 
cation factor. Therefore, these normalized counts could 
not to used as normal counts, making the definition of 
a cut-off clone size problematic. To overcome this prob¬ 
lem, we estimate the power law cut-off from the value of 
the cumulative distribution function at the cut-off clone 
size (instead of the cut-off clone size itself). That value 
is invariant under rescaling of absolute clone size values, 
unlike C*. 

We notice that the steady state solution is invariant 
under a full rescaling of time in the equations of the dy¬ 
namics. This means that the system can be described by 
two dimensionless parameters, a = /oA 2 / 7 2 and A 3 /y 2 , 
and the introduction size Co- Fitting a to data and as¬ 
suming value for Co, we can compare the value of the 
power law cut-off in data and in simulations to fit the 
remaining dimensionless parameter, A 3 / 7 2 . Estimating 
/o based on thymic output we can predict the order of 
magnitude of A and 7 . 


Appendix H: Cell specific simulations 

In the “A model of fluctuating phenotypic fitness” 
Results section of the main text, we present results of 
Fokker-Planck simulations for the cells dynamics. Here 
we verify that the stochastic dynamics of cells subject to 


Fish 

1 + 0 

C* 

log(l - CDF(C*)) 

A 

2.0591 

32.6445 

- 3.1389 

B 

2.0214 

10.7231 

-1.8644 

C 

2.0708 

16.7386 

-2.4655 

D 

2.0670 

14.9313 

-2.1492 

E 

2.0529 

8.2685 

-1.8332 

F 

2.0006 

5.8972 

-1.6161 

G 

1.9867 

52.2909 

-2.7329 

H 

2.2242 

32.1719 

-2.6877 

I 

2.0835 

18.4385 

-2.2757 

J 

1.6907 

44.4885 

-2.2877 

K 

1.7641 

3.6030 

-0.9907 

L 

1.9417 

18.5298 

-2.2730 

M 

1.9901 

18.5531 

-2.2031 

N 

1.8877 

108.4732 

-2.7984 


TABLE I: Fit of the power law exponent of the clone size 
distribution 1 + a and power law cut-off value C* for zebrafish 
B-cell heavy chain D segment data from Weinstein et al [2] 
presented in Fig. 1. The fit for 14 fish (named A to N) shows 
a similar fit of the power law exponent. 


a fluctuating cell-specific fitness are well approximated at 
the population level by a Fokker-Planck equation with a 
source term accounting for the import of new clones by 
comparing its numerical steady-state solution obtained 
by a finite elements method to explicit Gillespie simu¬ 
lations. We simulated the dynamics of clones using a 
Gillespie algorithm where cell division and death are ac¬ 
counted for explicitly and depend linearly on a fitness 
f c (t) fluctuating according to Eq. 7. The death rate is 
kept constant (above the average birth rate) and the fluc¬ 
tuations of the fitness only affect the birth rate (with the 
constraint that the birth rate is always positive). The 
agreement between the results of this detailed simulation 
and the Fokker-Planck solution, shown in Fig. |S 6 [ vali¬ 
dates the linear-noise approximation for the birth-death 
noise as well as the averaging argument leading to Eq. 8 
and 9. This allows us to rely on the Fokker-Planck solu¬ 
tion to explore parameter space. 


Appendix I: Model of cell-specific fitness 
fluctuations, and its limit of no heritability 

The cell specific fitness model described in the “A 
model of fluctuating phenotypic fitness” Results section 
of the main text arises as a description of a population 
where each cell experiences its own growth fluctuations 
but cells deriving from the same lineage remain corre¬ 
lated. In this Appendix we derive the equations that 
describe the dynamics of clones in this system. 

Each cell c experiences a time-correlated multiplicative 
noise from environmental growth factors. For cells j in a 
given cell lineage (or clone) i, each individual cell’s fitness 









13 



FIG. S6: Comparison of the Fokker-Planck solution (red line) 
and explicit Gillespie simulations of the dynamics (blue line) 
for the cell specific fitness model discussed in the “A model 
of fluctuating phenotypic fitness” Results section of the main 
text, show good agreement allowing us to use the population 
level Fokker-Planck solution to explore parameter space. Pa¬ 
rameters were taken to be fi = 0.5 day -1 , v — 0.8 day -1 , 
Co = 2, A = 4 days -1 and y c = 4 day -3 / 2 . 



follows the stochastic dynamics: 

d t fc(t ) = -Xf c + V2-y c r] c (II) 

where ( rj c (t)rj c (t')) = — Averaging over all cells in 

the clone, we obtain 


dtCi — foCi + fjCi + y/(/i + v)C£i 

d t fi = -A fi + J yrlcVi, 
where fi is the average fitness in clone i 

fi(t ) = 


( 12 ) 


( 13 ) 


cei 


a nd where we have added a birth-death noise term 
y/(/i + v)Ci£i. We use the Ito convention for the birth- 
death noise, (£i(t)£i(t / )) = and the Stratanovich 

one for the environmental noise (rji(t)r}i(t f )) = 5(t — t'). 
The equivalent equations for x = log C are 

dtXi = /o + Zi + VM+Pe-^-e-**^ (14) 
dtfi = -A fi + v / 2e _ * i/2 7 c r? i (15) 


and the Fokker-Planck equation is 


d t p(t, x, /) = - (/o + f)d x p + A df(fp) + e *7 c d)p 


p + v, 


+ '-^d x (e-*p) + ^dZ(e-*p) 


p + v, 


+ s(x, /), 


FIG. S7: Varying the dimensionless parameter related to the 
effective strength of antigen fluctuations relative to their char¬ 
acteristic lifetime A 3 /y 2 does not affect the exponent of the 
power law if the ratio between exponential decay A and stan¬ 
dard deviation of the variation 7 is kept constant. For all 
three curves the exponent is a = 0.8 and /x = 0.5 days -1 , 
v — 0.8 days -1 , Co = 2 while A and 7 vary. 


where s(x, /) is the joint distribution of size and fitness 
or newly arriving clones (from thymic or bone marrow 
output). This is the full Fokker-Planck equation that 
is solved numerically in the main text using the finite 
elements method. 

Because of the 1 /y/Cl prefactor in front of the noise 
term, we could expect fitness fluctuations to behave like a 
birth-death noise in the limit of low heritability (A —>■ 00 ). 
In the remainder of this Appendix we show that this is 
not the case, and we show how to take the limit of no 
heritability properly. 

Consider the limit of A 00 and q c 00 , keeping the 
ratio 7 c/A constant, so that / does not become infinites¬ 
imally small. The equation for the environmental stimu¬ 
lation / in x = log C space is given by (in Stratanovich 
convention) 


( 16 ) 


d t f = -A/ + V2^ c e x/2 r). 


( 17 ) 
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where K(t) is a constant to control the variations of the 
integral of £ with probability e (where K(t , e) = 4> -1 (l — 
|) with <f> the CDF of a standard Gaussian distribution). 
The second sub-integral is 

rk/x 

\/2 7c / e~ Xu e~ x( - t ~ u ^ 2 r)(t - u)du 

Jo (Ill) 

^ e~ x J~^ 2 r](t)V 2^(1-e- fe ). 

A 

We choose k = y/\ and in the limit of A oo and 
7 c —>> oo keeping 7 C /A = const we obtain the final form 
of environmental fluctuations 

m —► ( 112 ) 

where £ - means the left-hand limit, /(£) depends only 
on the past, which means that in x = logC space the 
noise is similar to a birth-death noise in the Ito conven¬ 
tion. Yet in terms of clone sizes C additional Ito terms 
make the effect of environmental fluctuations different 
from classical birth-death dynamics. 


Appendix J: Model solutions for cell-specific fitness 
fluctuations in the limit of no heritability 


FIG. S8: Large deviations can influence the effect of Poisson 
noise on the simulated clone size distributions and create a 
discrepancy between Poisson noise (red line) and the Gaussian 
approximations (black line) we assume in the main text. The 
discrepancy is most apparent for small clones. We simulated 
the Langevin dynamics of the Gaussian model with /x = 0.5 
day -1 , v ml day -1 , Co = 2 , A = 3 day -1 and 7=1 day -3 / 2 
and the same dynamics with Poisson noise and /x = 0.5 day -1 , 
v — 1 day -1 , Co — 2 , A = 3 day -1 and sa — 10 7 day -1 . In 
both cases we introduce sc — 2000 new clones per day. 


Direct integration gives 

f{t) = \/ 2 7c [ e- Xu e~ x J- u ^ 2 ri{t - u)du (18) 

Jo 

and we divide the integral into two sub-integrals for k > 0 

f(t) = \/ 2 7c [ e- Xu e~ x ^- u ^ 2 r]it - u)du 
Jk/X 


Ik /A 
r k/X 


nK/A 

+V / 2 7c / e -Au e -x(t-«)/2^ ^ 

Jo 


(19) 


With infinite precision, for any value of £, we set the 
integral of 7 to be bounded and obtain the first integral 
is with probability 1 — e smaller in norm than 


-k 


( 110 ) 


In this Appendix we solve the model of cell-specific fit¬ 
ness fluctuations in the limit where trait heritability is 
low. In this limit, the dynamics is described by a model 
with an instantaneous random fitness that is uncorre¬ 
lated for cells in the same clone. The resulting Langevin 
equation reads: 


— foCi + y/^Ci^rji + + V (l 1 + v)Citii (Jl) 

where all noise is treated in the Ito convention, and 
where the extra term 7 ^/A 2 comes the converting back 
the low-heritability limit of the fitness fluctuations, given 
by Eq. |I 12 [ into C = e x space. We note that although the 
fitness and birth-death noise have very similar forms, the 
birth-death noise is self-generated and intrinsic, while the 
fitness noise is environmental and extrinsic. This small 
difference greatly affects the steady-state clone size dis¬ 
tribution. 

To see this, we first consider the case of no birth-death 
noise. In the cell-specific fitness model consider the fol¬ 
lowing equations with the Stratanovich rule: 


dtCimfoCi + fCi, 

d t fi = -A ft + Y|- 7 c?7i, 


and its equivalent for x = log(C) 


d t Xi = fo + fi, 

dtfi = -A fi + e~ x d 2 ^ c rji 


(J2) 


V2 'jcVtK ie)e 


(J3) 
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In Appendix [T] we have shown that in the limit of A —>■ oo 
and 7 C ^ oo, the system reduces to the one dimensional 
equation 


d t Xi = fo + e Xi/2 V2y? 7 , (J4) 


with the Ito rule for the white noise The correspond¬ 
ing Fokker-Planck equation is 


dtp = d x (-f 0 p) + -d 2 x 


27c -x 

A^ ' 


+ s(x). (J5) 


Assuming a deterministic introduction size s(x) = 
sc^(x — x 0 ), at steady-sate we get 


K - s c 0(x - x 0 ) = -foP + e x -^p' 


Is. -x 


(J6) 


which for x > xq is solved by 



p(x) 


0 -e*/C m +x 


KEi(e x /C m ) - KEiiC- 1 ) (J7) 

,Xq 


s c A 2 e x s c A 2 e 

Ic Ic ^rn 


(J8) 


where K is an integration constant, Ei is the exponential 
integral function and 


C m = 


7c 


l/o|A 2 ' 


(J9) 


The divergence of Ei at infinity sets K = scA 2 /(7 2 ) and 
the clone size distribution is 

, v _ f {Ei(e x /Cm) ~ ^(C" 1 )) e -e x /Cm+x £ or x < 

^ | (Ei(e x °/Cm) — Ei(C~ x )) e ~ eXCrn + x for x > x 0 

(J10) 

or in terms of x = log C 

W) = f e~ c ! Cm (. Ei(C/C m ) - ^(C- 1 )) for C < C 0 
P (. Ei(e x °/C m ) - ^(C- 1 )) for C > C 0 

(Jll) 

The validity of this solution is checked in Fig.[S9]and the 
convergence of the full solution of Eq. [fo] (with no birth- 
death noise) to the analytical solution in the limit of no 
heritability (A —)> 00 ) is show in Fig. |S10| 

For comparison, in a pure birth-death process (no 
fitness fluctuations) the clone-size distribution is, for 
C large enough, p{C) ~ e~ c ^ Crn /C where C m = 
(/i + u)/(2(/jL — 1 /)), as shown in Appendix |A| These two 
solutions both have an exponential cutoff, but have very 
different power-law exponents, corresponding to a = 0 
and a = — 1, respectively. 

We now add the birth-death noise, i.e. consider both 
types of noise, still in the limit of no heritability. The 
corresponding Langevin equation reads: 


d t Xi = f 0 + s/JT+ve Xi/2 £ - e 


- Xi M + v 


a Xi/2 


A 

(J12) 


V 


FIG. S9: The result of a simulation of the Langevin equation 
of the white noise cell-specific fitness mo del ( blue line) com¬ 
pared to the analytical prediction of Eq. Jll (red line) show 
very good agreement. The parameters are p = 0.2 day -1 , 
v — 0.4 day -1 , Co = 2, A = 4 day -1 and y c = 8 day -3 / 2 . 



FIG. S10: Convergence of the cell-specif ic fit ness models 
(Eq. [fo]) without birth-death noise to Eq. Jll in the limit 
of no heritability (A —>• 00 ). For all four curves a — 0.2. Pa¬ 
rameters used: p = 0.2 day -1 , v — 0.25 day -1 , Co = 2 and 
1000 new clones introduced each day. 


where all noise is in the Ito convention. Integrating the 
Fokker Planck associated to this equation gives at steady 
state condition 


K-s c 0(x-x 0 ) = -foP+ 


, 7c 

2 A 2 


^ 2 

e~ x p'-fe- x p. 


( J13 ) 

In order for p to be well defined we set K = sc- For x > 
xq the equation is homogeneous and solved by separation 
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of variables: 


—e -1 

P 


P + V , 7c 


2 A 2 
and gives the solution: 

p(C) = 


with 


a = - 1 + 


~ [ fo + ^|e * ) p, 


Ke~ c / c ” 


( 71 +a ’ 


(/i + 

27c 2 


(J14) 


(J15) 


(J16) 


which is a power-law with an exponent 0 < 1 + a < 1 
and an exponential cutoff 


FIG. Sll: Convergence of the cell-specific models (Eq . |Tg| ) 
with birth-death noise to the analytical result of Eq. |J15| (red 
line). Keeping constant a while A —>> oo and y c —>• oo we 
recover the solution of Eq. |J15| Parameters are the same as 
in Fig. [S10] 


+ P17) 

The convergence of the solution of the full system, Eq. [I6j 
to this solution is checked in Fig. m 
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